Bayesian Nets for Syntactic Categorization of Novel Words
Authors
Abstract
This paper presents an application of a Dynamic Bayesian Network (DBN) to the task of assigning Part-of-Speech (PoS) tags to novel text. This task is particularly challenging for non-standard corpora, such as Internet lingo, where a large proportion of words are unknown. Previous work shows that PoS tags depend on a variety of morphological and contextual features. Representing these dependencies in a DBN results in an elegant and effective PoS tagger.
Similar resources
Bayesian Nets in Syntactic Categorization of Novel Words
This paper presents an application of a Dynamic Bayesian Network (DBN) to the task of assigning Part-of-Speech (PoS) tags to novel text. This task is particularly challenging for non-standard corpora, such as Internet lingo, where a large proportion of words are unknown. Previous work reveals that PoS tags depend on a variety of morphological and contextual features. Representing these dependen...
Modeling Syntactic Context Improves Morphological Segmentation
The connection between part-of-speech (POS) categories and morphological properties is well-documented in linguistics but underutilized in text processing systems. This paper proposes a novel model for morphological segmentation that is driven by this connection. Our model learns that words with common affixes are likely to be in the same syntactic category and uses learned syntactic categories...
Text Categorization Using Predicate-Argument Structures
Most text categorization methods use the vector space model in combination with a representation of documents based on bags of words. As its name indicates, bags of words ignore possible structures in the text and only take into account isolated, unrelated words. Although this limitation is widely acknowledged, most previous attempts to extend the bag-of-words model with more advanced approac...
Using Syntactic and Semantic based Relations for Dialogue Act Recognition
This paper presents a novel approach to dialogue act recognition employing multilevel information features. In addition to features such as context information and words in the utterances, the recognition task utilizes syntactic and semantic relations acquired by information extraction methods. These features are utilized by a Bayesian network classifier for our dialogue act recognition. The ev...